Publishing Efficient On-device Models Increases Adversarial Vulnerability
Recent increases in the computational demands of deep neural networks (DNNs)
have sparked interest in efficient deep learning mechanisms, e.g., quantization
or pruning. These mechanisms enable the construction of a small, efficient
version of commercial-scale models with comparable accuracy, accelerating their
deployment to resource-constrained devices.
In this paper, we study the security considerations of publishing on-device
variants of large-scale models. We first show that an adversary can exploit
on-device models to make attacking the large models easier. In evaluations
across 19 DNNs, by exploiting the published on-device models as a transfer
prior, the adversarial vulnerability of the original commercial-scale models
increases by up to 100x. We then show that this vulnerability grows as the
similarity between a full-scale model and its efficient counterpart increases.
Based on these insights, we propose a defense that fine-tunes on-device models
with the objective of reducing this similarity. We evaluated our defense on
all 19 DNNs and found that it reduces transferability by up to 90% and the
number of queries required by a factor of 10-100x. Our results
suggest that further research is needed on the security (or even privacy)
threats caused by publishing those efficient siblings.
Comment: Accepted to IEEE SaTML 2023
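To make the transfer-prior idea concrete, below is a minimal sketch, not the
paper's exact attack: adversarial examples are crafted white-box against the
published on-device model with standard PGD and then replayed against the
full-scale target. The model handles, epsilon budget, and step sizes are
illustrative assumptions.

    import torch

    def pgd_on_surrogate(surrogate, x, y, eps=8/255, alpha=2/255, steps=10):
        # Standard L-infinity PGD, run white-box against the on-device model.
        x_adv = x.clone().detach()
        for _ in range(steps):
            x_adv.requires_grad_(True)
            loss = torch.nn.functional.cross_entropy(surrogate(x_adv), y)
            grad = torch.autograd.grad(loss, x_adv)[0]
            with torch.no_grad():
                x_adv = x_adv + alpha * grad.sign()        # ascend the loss
                x_adv = x + (x_adv - x).clamp(-eps, eps)   # project into eps-ball
                x_adv = x_adv.clamp(0.0, 1.0)              # stay a valid image
        return x_adv.detach()

    @torch.no_grad()
    def transfer_success_rate(target, x_adv, y):
        # Fraction of surrogate-crafted examples that also fool the large model.
        return (target(x_adv).argmax(dim=1) != y).float().mean().item()

The abstract's core observation then reads directly off this setup: the more
similar the two models, the larger the measured transfer success rate.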
Handcrafted Backdoors in Deep Neural Networks
Deep neural networks (DNNs), while accurate, are expensive to train. Many
practitioners, therefore, outsource the training process to third parties or
use pre-trained DNNs. This practice makes DNNs vulnerable to backdooring
attacks: the third party who trains the model may act maliciously to inject
hidden behaviors into the otherwise accurate model. Until now, the mechanism to
inject backdoors has been limited to poisoning.
We argue that such a supply-chain attacker has more attack techniques
available. To study this hypothesis, we introduce a handcrafted attack that
directly manipulates the parameters of a pre-trained model to inject backdoors.
Our handcrafted attacker has more degrees of freedom in manipulating model
parameters than poisoning. This makes it difficult for a defender to identify
or remove the manipulations with straightforward methods, such as statistical
analysis, adding random noises to model parameters, or clipping their values
within a certain range. Further, our attacker can combine the handcrafting
process with additional techniques, e.g., jointly optimizing a trigger
pattern, to inject backdoors into complex networks effectively, which we call
the meet-in-the-middle attack.
In evaluations, our handcrafted backdoors remain effective across four
datasets and four network architectures with a success rate above 96%. Our
backdoored models are resilient to parameter-level backdoor removal
techniques and can evade existing defenses by slightly changing the backdoor
attack configurations. Moreover, we demonstrate the feasibility of suppressing
unwanted behaviors otherwise caused by poisoning. Our results suggest that
further research is needed for understanding the complete space of supply-chain
backdoor attacks.
Comment: 16 pages, 13 figures, 11 tables
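As a rough, hypothetical illustration of what directly manipulating parameters
can mean (a toy construction, not the paper's actual procedure), the sketch
below repurposes one hidden unit of a small MLP so that it fires on a chosen
trigger pattern and routes its activation into the attacker's target class.
The architecture, the scaling constants, and the assumption that the trigger
correlates weakly with clean inputs are all assumptions of this sketch.

    import torch

    def plant_handcrafted_backdoor(mlp, trigger, target_class,
                                   neuron_idx=0, in_scale=10.0, out_scale=20.0):
        # mlp is assumed to be nn.Sequential(Linear(d, h), ReLU(), Linear(h, k)).
        # trigger is a flat (d,) pattern assumed nearly orthogonal to clean
        # inputs, so the repurposed unit stays silent on clean data and fires
        # strongly only when the trigger is present.
        with torch.no_grad():
            fc_in, fc_out = mlp[0], mlp[2]
            fc_in.weight[neuron_idx] = in_scale * trigger / trigger.norm()
            fc_in.bias[neuron_idx] = 0.0
            fc_out.weight[:, neuron_idx] = 0.0           # detach from other classes
            fc_out.weight[target_class, neuron_idx] = out_scale  # boost target logit
        return mlp

Because no gradient step is involved, edits like this leave the loss landscape
of clean training untouched, which is one intuition for why statistical
defenses over parameters have a hard time spotting them.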
Differentially Private Image Classification from Features
Leveraging transfer learning has recently been shown to be an effective
strategy for training large models with Differential Privacy (DP). Moreover,
somewhat surprisingly, recent works have found that privately training just the
last layer of a pre-trained model provides the best utility with DP. While past
studies largely rely on algorithms like DP-SGD for training large models, in
the specific case of privately learning from features, we observe that
computational burden is low enough to allow for more sophisticated optimization
schemes, including second-order methods. To that end, we systematically explore
the effect of design parameters such as loss function and optimization
algorithm. We find that, while commonly used logistic regression performs
better than linear regression in the non-private setting, the situation is
reversed in the private setting. We find that linear regression is much more
effective than logistic regression from both a privacy and a computational
standpoint, especially at stricter (smaller) epsilon values. On the optimization
side, we also explore using Newton's method, and find that second-order
information is quite helpful even with privacy, although the benefit
significantly diminishes with stricter privacy guarantees. While both methods
use second-order information, least squares is effective at lower epsilons
while Newton's method is effective at larger epsilon values. To combine the
benefits of both, we propose a novel algorithm called DP-FC, which leverages
feature covariance instead of the Hessian of the logistic regression loss and
performs well across all values we tried. With this, we obtain new
SOTA results on ImageNet-1k, CIFAR-100 and CIFAR-10 across all values of
typically considered. Most remarkably, on ImageNet-1K, we obtain
top-1 accuracy of 88\% under (8, )-DP and 84.3\% under (0.1, )-DP
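As a generic sketch of learning privately from features via their covariance,
the code below privatizes the sufficient statistics of linear regression with
Gaussian noise and solves the resulting normal equations. This is not the
paper's exact DP-FC algorithm: the calibration assumes feature rows clipped to
unit L2 norm, one-hot labels, and a noise scale sigma supplied by a DP
accountant for the target (epsilon, delta).

    import numpy as np

    def dp_linear_regression(X, Y, sigma, ridge=1e-2, seed=0):
        # X: (n, d) features, each row clipped to L2 norm <= 1;
        # Y: (n, k) one-hot labels, so each example perturbs X.T @ X and
        # X.T @ Y by at most 1 in Frobenius norm.
        rng = np.random.default_rng(seed)
        n, d = X.shape
        k = Y.shape[1]
        cov = X.T @ X            # feature covariance (sufficient statistic)
        xty = X.T @ Y            # feature-label correlation
        noise = rng.normal(0.0, sigma, size=(d, d))
        cov_noisy = cov + (noise + noise.T) / np.sqrt(2)   # symmetric noise
        xty_noisy = xty + rng.normal(0.0, sigma, size=(d, k))
        # Ridge regularization keeps the noisy covariance well conditioned.
        W = np.linalg.solve(cov_noisy + ridge * n * np.eye(d), xty_noisy)
        return W                 # classify a feature x as argmax of x @ W

One appeal of this family of methods, consistent with the abstract, is that the
noise is added once to fixed-size statistics rather than per SGD step, so the
cost of privacy does not grow with the number of training iterations.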
RETVec: Resilient and Efficient Text Vectorizer
This paper describes RETVec, an efficient, resilient, and multilingual text
vectorizer designed for neural-based text processing. RETVec combines a novel
character encoding with an optional small embedding model to embed words into a
256-dimensional vector space. The RETVec embedding model is pre-trained using
pair-wise metric learning to be robust against typos and character-level
adversarial attacks. In this paper, we evaluate and compare RETVec to
state-of-the-art vectorizers and word embeddings on popular model architectures
and datasets. These comparisons demonstrate that RETVec leads to competitive,
multilingual models that are significantly more resilient to typos and
adversarial text attacks. RETVec is available under the Apache 2 license at
https://github.com/google-research/retvec.
Comment: Accepted at NeurIPS 2023
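For intuition, here is a minimal sketch of pair-wise metric learning for typo
robustness, using a symmetric InfoNCE-style contrastive loss with in-batch
negatives as a stand-in for RETVec's actual objective. The embed function, the
paired-batch construction, and the temperature are assumptions; this is not
RETVec's training code or API.

    import torch
    import torch.nn.functional as F

    def pairwise_metric_loss(embed, words, typo_words, temperature=0.1):
        # embed maps a batch of encoded words to fixed-size vectors;
        # words[i] and typo_words[i] are the same word, clean and typo-perturbed.
        z_a = F.normalize(embed(words), dim=1)
        z_b = F.normalize(embed(typo_words), dim=1)
        logits = z_a @ z_b.T / temperature        # cosine similarity matrix
        targets = torch.arange(z_a.size(0), device=logits.device)
        # Matching pairs sit on the diagonal; everything else is a negative.
        return 0.5 * (F.cross_entropy(logits, targets)
                      + F.cross_entropy(logits.T, targets))

Training against such pairs pushes a word and its misspellings to nearby
points in the 256-dimensional space, which is what makes downstream models
resilient to typos and character-level attacks.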
Quaternion-Based Self-Attentive Long Short-Term User Preference Encoding for Recommendation
Quaternion space has brought several benefits over the traditional Euclidean
space: Quaternions (i) consist of a real and three imaginary components,
encouraging richer representations; (ii) utilize Hamilton product which better
encodes the inter-latent interactions across multiple Quaternion components;
and (iii) result in models with fewer degrees of freedom that are less prone to
overfitting. Unfortunately, most current recommender systems rely on
real-valued representations in Euclidean space to model either users' long-term
or short-term interests. In this paper, we fully utilize Quaternion space to
model both users' long-term and short-term preferences. We first propose a
QUaternion-based self-Attentive Long term user Encoding (QUALE) to study the
user's long-term intents. Then, we propose a QUaternion-based self-Attentive
Short term user Encoding (QUASE) to learn the user's short-term interests. To
enhance our models' capability, we propose to fuse QUALE and QUASE into one
model, namely QUALSE, by using a Quaternion-based gating mechanism. We further
develop Quaternion-based Adversarial learning along with the Bayesian
Personalized Ranking (QABPR) to improve our model's robustness. Extensive
experiments on six real-world datasets show that our fused QUALSE model
outperformed 11 state-of-the-art baselines, improving HIT@1 by 8.43% and
NDCG@1 by 10.27% on average compared with the best baseline.
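For reference, the Hamilton product on quaternions q = a + b*i + c*j + d*k,
which the abstract credits with richer inter-component interactions than an
element-wise product; a plain NumPy sketch, independent of the QUALSE models
themselves.

    import numpy as np

    def hamilton_product(q, p):
        # q, p: arrays (a, b, c, d) for a + b*i + c*j + d*k.
        a1, b1, c1, d1 = q
        a2, b2, c2, d2 = p
        return np.array([
            a1*a2 - b1*b2 - c1*c2 - d1*d2,   # real part
            a1*b2 + b1*a2 + c1*d2 - d1*c2,   # i
            a1*c2 - b1*d2 + c1*a2 + d1*b2,   # j
            a1*d2 + b1*c2 - c1*b2 + d1*a2,   # k
        ])

Every component of the output mixes all four components of both inputs, which
is the cross-component coupling point (ii) refers to; used as the
multiplication inside a layer, it also ties parameters together, giving the
reduced degrees of freedom noted in (iii).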